Summary: A general framework is proposed for Bayesian model-based designs of Phase I cancer
trials, in which a general criterion for coherence (Cheung, 2005) of a design is also developed. This
framework can incorporate both "individual" and "collective" ethics into the design of the trial. We
propose a new design which minimizes a risk function composed of two terms, with one representing 
the individual risk of the current dose and the other representing the collective risk. The performance 
of this design, which is measured in terms of the accuracy of the estimated target dose at the end 
of the trial, the toxicity and overdose rates, and certain loss functions reflecting the individual and 
collective ethics, is studied and compared with existing Bayesian model-based designs and is shown 
to have better performance than existing designs. 
1. Introduction

A Phase I trial for a new treatment is generally intended to determine a dose to use in 
subsequent Phase II and III testing. Phase I cancer trials have the additional complexity 
that the treatment in question is usually a cytotoxic agent and the efficacy usually increases 
with dose, and therefore it is widely accepted that some degree of toxicity must be tolerated to 
experience any substantial therapeutic effects. Hence, an acceptable proportion p of patients 
experiencing dose limiting toxicities (DLTs) is generally agreed on before the trial, which 
depends on the type and severity of the DLT; the dose resulting in this proportion is thus 
referred to as the maximum tolerated dose (MTD). In addition to the explicitly stated 
objective of determining the MTD, a Phase I cancer trial also has the implicit goal of safe 
treatment of the patients in the trial. However, the aims of treating patients in the trial and 
generating an efficient design to estimate the MTD for future patients often run counter to 
each other. Commonly used designs in Phase I cancer trials implicitly place their focus on 
the safety of the patients in the trial, beginning from a conservatively low starting dose and 
escalating cautiously. Escalation is further slowed by the assignment of the same dose to 
groups of consecutive patients, as in the widely used 3-plus-3 design, which is convenient to 
administer and shortens trial duration by simultaneously following patients in groups of 3.
Von Hoff and Turner (1991) have documented that the overall response rates in these Phase I
trials are low, and substantial numbers of patients are treated at doses that are retrospectively
found to be non-therapeutic. Moreover, as pointed out by O'Quigley, Pepe and Fisher (1990),
these designs are very inefficient for estimating the MTD, which is implied by the 3-plus-3
design to correspond to the case p = 1/3. They proposed a Bayesian model-based design,
called the "continual reassessment method" (CRM), to choose the dose levels sequentially,
making use of all past data at each stage.



More than ninety new Phase I methods were published between 1991 and 2006 (Rogatko et al.,
2007), and there have been several reviews of the new methods (e.g., Rosenberger and Haines,
2002). In this paper we focus on Bayesian model-based designs and Section 2 describes a
general framework to develop and analyze them. As shown in Sections 2 and 3, this framework
allows one to incorporate the competing aims of a Phase I cancer trial by choosing the loss
function accordingly. It also enables one to derive certain desirable properties of the design,
such as coherence (Cheung, 2005), from the loss function, or to enforce them by using simple
reformulations in this framework. Section 4 provides implementation details and gives a
simulation study comparing Bayesian designs that correspond to different loss functions in
the setting of a colon cancer trial considered by Babb, Rogatko and Zacks (1998).



2. Posterior Distributions, Loss Functions and Sequential Dose Determination 

A commonly used model-based approach to Phase I cancer clinical trial design assumes the
usual logistic regression model for the probability $F_\theta(x)$ of DLT at dose level $x$:

$$F_\theta(x) = 1/(1 + e^{-(\alpha + \beta x)}), \tag{1}$$

in which $\beta > 0$ and $\theta = (\alpha, \beta)$ is unknown and to be estimated from the observed pairs
$(x_i, y_i)$, where $y_i = 1$ if the $i$th subject, treated at dose $x_i$, experiences DLT and $y_i = 0$
otherwise. The frequentist approach to inference on $\theta$ uses the likelihood function and
estimates $\theta$ by maximum likelihood, while the Bayesian approach assumes a prior distribution
of $\theta$ and uses the posterior distribution for inference on $\theta$.
Denote the MTD by $\eta = F_\theta^{-1}(p)$ and the posterior distribution of $\theta$ based on
$(x_1, y_1), \ldots, (x_k, y_k)$ by $\Pi_k$, and let $\Pi_0$ denote the prior distribution. The Bayes estimate
of $\eta$ with respect to squared error loss is the posterior mean $E_{\Pi_k}(\eta)$, and the CRM proposed
by O'Quigley, Pepe and Fisher (1990) uses this posterior mean to set the dose for the next
patient, i.e., $x_{k+1} = E_{\Pi_k}(\eta)$. Instead of the posterior mean, Babb, Rogatko and Zacks (1998)
proposed to set $x_{k+1}$ equal to the $\omega$-quantile of the posterior distribution, where
$0 < \omega < 1/2$ is chosen to be slightly less than $p$ in their examples. This design is called
"escalation with overdose control" (EWOC) and $\omega$ is called the "feasibility bound." A sequence
of doses $x_n$ is called "Bayesian feasible" at level $1 - \omega$ if
$P_{\Pi_{n-1}}(\eta \ge x_n) \ge 1 - \omega$ for all $n \ge 1$, and the EWOC doses are optimal among
Bayesian-feasible ones; see Zacks, Rogatko and Babb (1998).



Note that the dose for the $n$th patient in CRM or EWOC depends only on the posterior
distribution $\Pi_{n-1}$, i.e., $x_n$ is a functional $f(\Pi_{n-1})$ of $\Pi_{n-1}$. This functional defines
$\{\Pi_k : k \ge 0\}$ as a Markov chain whose states are distributions on the parameter space
$\Theta$ and whose state transitions are given by the following.

Bayesian updating scheme: Given current state $\Pi$ (which is a prior distribution of $\theta$), let
$x = f(\Pi)$ and generate first $\theta$ from $\Pi$ and then $y \sim \mathrm{Bern}(F_\theta(x))$. The new state is the
posterior distribution of $\theta$ given $(x, y)$.
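The updating scheme can be illustrated with a minimal sketch, assuming the posterior is stored as weights on a small discrete grid of $(\alpha, \beta)$ values. The grid, the helper names (`dlt_prob`, `mtd`, `update`, `step`) and the choice of the posterior mean as the functional $f$ are illustrative assumptions, not part of the designs described above.

```python
import math
import random

P_TARGET = 1 / 3  # target DLT probability p

def dlt_prob(theta, x):
    """Logistic model (1): probability of DLT at dose x for theta = (alpha, beta)."""
    alpha, beta = theta
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def mtd(theta, p=P_TARGET):
    """MTD eta = F_theta^{-1}(p) under the logistic model."""
    alpha, beta = theta
    return (math.log(p / (1.0 - p)) - alpha) / beta

def update(grid, weights, x, y):
    """Posterior weights on the grid after observing outcome y at dose x."""
    new = [w * (dlt_prob(th, x) if y == 1 else 1.0 - dlt_prob(th, x))
           for th, w in zip(grid, weights)]
    total = sum(new)
    return [w / total for w in new]

def posterior_mean_mtd(grid, weights):
    """CRM-type functional f(Pi): posterior mean of the MTD."""
    return sum(w * mtd(th) for th, w in zip(grid, weights))

def step(grid, weights, dose_rule, rng):
    """One transition: x = f(Pi), theta ~ Pi, y ~ Bern(F_theta(x)), then update."""
    x = dose_rule(grid, weights)
    theta = rng.choices(grid, weights=weights, k=1)[0]
    y = 1 if rng.random() < dlt_prob(theta, x) else 0
    return x, y, update(grid, weights, x, y)

# Hypothetical 3 x 3 grid of (alpha, beta) values with a uniform prior.
grid = [(a, b) for a in (-6.0, -5.0, -4.0) for b in (0.010, 0.015, 0.020)]
prior = [1.0 / len(grid)] * len(grid)
x1, y1, post1 = step(grid, prior, posterior_mean_mtd, random.Random(0))
```

Calling `step` repeatedly, each time passing the returned weights back in, traces one path of the Markov chain of posterior distributions.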



The functional $x = f(\Pi)$ for CRM is $E_\Pi(\eta)$, which minimizes the expected squared error
loss $E_\Pi[(\eta - x)^2]$. As pointed out by Babb, Rogatko and Zacks (1998), the symmetric nature
of the squared error loss may not be appropriate for modeling the toxic response to a cancer
treatment. Instead of squared error loss, EWOC with feasibility bound $\omega$ uses the functional
$x = x(\Pi)$ that minimizes the asymmetric loss function $E_\Pi[\ell(\eta, x)]$, where

$$\ell(\eta, x) = \begin{cases} \omega(\eta - x), & \text{if } x \le \eta \\ (1 - \omega)(x - \eta), & \text{if } x \ge \eta. \end{cases} \tag{2}$$
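As a short worked check, assuming the posterior distribution of the MTD is continuous, one can verify that the minimizer of the expected value of the loss (2) is the posterior quantile at the feasibility bound. Writing $\eta$ for the MTD, $\omega$ for the feasibility bound and $\Pi$ for the current posterior, differentiating the expected loss with respect to $x$ gives

$$\frac{d}{dx} E_\Pi[\ell(\eta, x)] = -\omega P_\Pi(\eta > x) + (1 - \omega) P_\Pi(\eta \le x) = P_\Pi(\eta \le x) - \omega .$$

This derivative is nondecreasing in $x$, so the expected loss is convex and is minimized where $P_\Pi(\eta \le x) = \omega$, that is, at the $\omega$-quantile of the posterior distribution of the MTD, which is the EWOC dose.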

More generally, we can consider other loss functions $\ell(\theta, x)$ and define $x(\Pi)$ that attains
$\min_x E_\Pi[\ell(\theta, x)]$. In particular, the following example gives a response-based version of EWOC.

Example 1: Inverted overdose control. The EWOC loss function (2) penalizes an overdose
$x > \eta$ by the amount $(1 - \omega)(x - \eta)$, and an under-dose $x < \eta$ by the amount $\omega(\eta - x)$.
However, a dose $x$ deemed "too large" on this scale may actually correspond to a probability
of DLT not much larger than the target rate $p$ depending on the dose-response curve, making
$x$ a relatively desirable dose. Likewise, a small value of $|x - \eta|$ may correspond to a large
discrepancy between the actual DLT probability $F_\theta(x)$ and $p$. Hardwick and Stout (2001)
suggest measuring the excess/deficit of the DLT rate on the "probability scale." Taking
$0 < \gamma < 1/2$, this leads to the "inverted" loss function

$$\ell(\theta, x) = \begin{cases} \gamma(p - F_\theta(x)), & \text{if } x \le \eta \\ (1 - \gamma)(F_\theta(x) - p), & \text{if } x \ge \eta. \end{cases} \tag{3}$$
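The contrast between the dose scale and the probability scale can be made concrete with a small sketch. It assumes two hypothetical logistic curves that share the same MTD but differ in slope; the parameter values and function names are illustrative only.

```python
import math

P, OMEGA, GAMMA = 1 / 3, 0.25, 0.25

def dlt_prob(eta, beta, x, p=P):
    """Logistic curve with slope beta, parametrized so that F(eta) = p."""
    alpha = math.log(p / (1.0 - p)) - beta * eta
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def ewoc_loss(eta, x, omega=OMEGA):
    """Dose-scale loss (2)."""
    return omega * (eta - x) if x <= eta else (1.0 - omega) * (x - eta)

def inverted_loss(eta, beta, x, p=P, gamma=GAMMA):
    """Probability-scale loss (3)."""
    f = dlt_prob(eta, beta, x, p)
    return gamma * (p - f) if x <= eta else (1.0 - gamma) * (f - p)

eta, x = 300.0, 320.0        # hypothetical MTD and an overdose of 20 dose units
flat, steep = 0.005, 0.05    # hypothetical slopes of the dose-response curve
print(ewoc_loss(eta, x))              # identical for both curves
print(inverted_loss(eta, flat, x))    # small excess DLT probability
print(inverted_loss(eta, steep, x))   # much larger excess DLT probability
```

The dose-scale loss cannot distinguish the two curves, while the inverted loss penalizes the same overdose far more heavily when the curve is steep.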



Remark: The loss when the dose falls below $\eta$, measured by the difference $\omega(\eta - x)$ in (2) and
$\gamma(p - F_\theta(x))$ in (3), should ideally be measured by the difference in response rates at $\eta$ and
$x$, respectively, when efficacy data, taking the value 1 if the patient responds to the treatment
and 0 otherwise, are also available besides the toxicity data. Note, however, that this involves
bivariate efficacy-toxicity data. While many existing designs solely consider toxicity outcomes,
bivariate designs have been considered by a number of authors, including Li, Durham and
Flournoy (1995), Hardwick and Stout (2001), Kpamegan and Flournoy (2001), Thall and Cook
(2004), Dragalin and Fedorov (2006), Dragalin, Fedorov et al. (2008), and Pronzato (2010).
When efficacy responses are available, the minimum effective
dose (MED) is of interest, i.e., the lowest dose at which some desired proportion of positive
efficacy responses is attained. When both efficacy and toxicity data are available, the optimal
safe dose, which is the dose between the MED and the MTD maximizing the probability of
simultaneous efficacy and non-toxicity, is of interest. Since this paper focuses on univariate
toxicity data, we consider elsewhere better alternatives to (3) for $x \le \eta$ that also require
efficacy data.



Noting that the explicitly stated objective of a Phase I cancer trial is to estimate the MTD,
Whitehead and Brunier (1995) considered Bayesian sequential designs that are optimal, in
some sense, for this estimation problem. Haines, Perevozskaya and Rosenberger (2003) made
use of the theory of optimal design of experiments (Fedorov, 1972; Atkinson and Donev, 1992;
Dette, Melas and Pepelyshev, 2004) to construct Bayesian c- and D-optimal designs,
and further imposed a relaxed Bayesian feasibility constraint on the design to avoid highly
toxic doses. Optimal design theory involves a design measure $\xi$ on the dose space $\mathcal{X}$, and
a sequential design updates the empirical design measure $\xi_{n-1}$ at stage $n$ by changing it to
$\xi_n$ with the addition of the dose $x_n$. The empirical measure $\xi_n$ of the doses $x_1, \ldots, x_n$ up
to stage $n$ can be represented by $\xi_n = n^{-1} \sum_{i=1}^n \delta_{x_i}$, where $\delta_x$ is the probability measure
degenerate at $x$. We let $\|\xi\|$ denote the number of $x_i$ (not necessarily distinct) in the support
of $\xi$. Thus $\|\xi_n\| = n$ and $\|\xi_0\| = 0$, with $\xi_0$ being the zero measure on $\mathcal{X}$. To include the
construction of sequential Bayesian optimal designs as a special case of our general approach,
we can modify the preceding procedure that minimizes $E_\Pi[\ell(\theta, x)]$ to choose the next dose
based on the current posterior distribution $\Pi$, by including the current design measure $\xi$ in
the loss function.



Example 2: Bayesian c- or D-optimal designs. As described by Haines, Perevozskaya and
Rosenberger (2003), optimal design theory is concerned with choosing a design measure $\xi$ to
minimize a convex function $\Psi$ of the information matrix $M(\theta, \xi) = \int I(\theta, x)\, d\xi(x)$, where
$I(\theta, x)$ is the Fisher information matrix at design point $x$:

$$I(\theta, x) = \frac{e^{\alpha + \beta x}}{(1 + e^{\alpha + \beta x})^2} \begin{pmatrix} 1 & x \\ x & x^2 \end{pmatrix}.$$

The convex function $\Psi$ is associated with the optimality criterion, e.g., $\Psi(M) = -\log\det(M)$
for D-optimality and $\Psi(M) = c' M^{-1} c$ for c-optimality. Since $\theta = (\alpha, \beta)$ is unknown,
the frequentist approach uses a sequential design that replaces $\theta$ in $M(\theta, \xi_t)$ by its
maximum likelihood estimate at every stage $t$. The Bayesian approach puts a prior distribution
$\Pi_0$ on $\theta$ and minimizes $\int \Psi(M(\theta, \xi_t))\, d\Pi_0(\theta)$. Noting that this Bayesian approach does
not accommodate the fact that patients are assigned doses sequentially in Phase I trials,
Haines, Perevozskaya and Rosenberger (2003, Section 5) propose to start the optimal design
after an initial sample of $k$ patients so that the dose $x$ of a patient after this initial sample
can be determined by minimizing

$$\int \Psi(\{k M(\theta, \xi_k) + I(\theta, x)\}/(k + 1))\, d\Pi_k(\theta), \tag{4}$$

where $\xi_k$ is the empirical measure of the initial sample of design points and $\Pi_k$ is the posterior
distribution of $\theta$ based on the initial sample.

We can easily extend our loss function approach to Bayes sequential designs by including $\xi$
as an argument of the loss function in this setting. Let $\Pi$ be the current posterior distribution
of $\theta$ and $\xi$ be the current empirical design measure. Define

$$\ell(\theta, x; \xi) = \Psi(M(\theta, \xi_{+\{x\}})), \quad \text{where } \xi_{+\{x\}} = \frac{\|\xi\|\,\xi + \delta_x}{\|\xi\| + 1}. \tag{5}$$

The sequential Bayes optimal design chooses the next design level $x$ that minimizes
$E_\Pi \ell(\theta, x; \xi)$. The measure $\xi_{+\{x\}}$ in (5) represents the new empirical measure obtained by
adding $x$ to the support of $\xi$, with $\|\xi_{+\{x\}}\| = \|\xi\| + 1$. We can also impose a relaxed
feasibility constraint in the choice of $x$:

$$\text{Minimize } E_\Pi \ell(\theta, x; \xi) \text{ subject to } P_\Pi(\tilde\eta \le x) \le \epsilon, \tag{6}$$

as in Haines, Perevozskaya and Rosenberger (2003), where $\tilde\eta = F_\theta^{-1}(q)$ with $q \ge p$ and
$\epsilon$ is a prescribed positive constant. If $q = p$, then $\tilde\eta = \eta$ and the constraint corresponds to
requiring the doses to be Bayesian feasible (see the description of EWOC above).
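A minimal sketch of the unconstrained sequential D-optimal choice, assuming a discrete posterior on a hypothetical grid of $(\alpha, \beta)$ values and a finite set of candidate doses, is given below. The grid, weights and function names are assumptions for illustration, and the feasibility constraint (6) is not implemented.

```python
import math

def fisher_weight(theta, x):
    """Scalar factor F(1 - F) = e^{a+bx} / (1 + e^{a+bx})^2 of I(theta, x)."""
    alpha, beta = theta
    f = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
    return f * (1.0 - f)

def d_criterion(theta, doses):
    """-log det M(theta, xi) for the empirical measure xi of `doses`."""
    n = len(doses)
    m11 = sum(fisher_weight(theta, x) for x in doses) / n
    m12 = sum(fisher_weight(theta, x) * x for x in doses) / n
    m22 = sum(fisher_weight(theta, x) * x * x for x in doses) / n
    det = m11 * m22 - m12 * m12
    # A (numerically) singular information matrix gets infinite loss.
    return -math.log(det) if det > 0.0 else math.inf

def next_dose(grid, weights, doses, candidates):
    """Candidate x minimizing the posterior mean of the loss in (5)."""
    def risk(x):
        return sum(w * d_criterion(th, doses + [x]) for th, w in zip(grid, weights))
    return min(candidates, key=risk)

# Hypothetical posterior on three parameter values and doses given so far.
grid = [(-6.0, 0.015), (-5.0, 0.015), (-4.0, 0.015)]
weights = [0.25, 0.5, 0.25]
candidates = [140.0 + 15.0 * i for i in range(20)]   # 140, 155, ..., 425
chosen = next_dose(grid, weights, [140.0, 140.0], candidates)
```

Because two patients have already been treated at the lowest dose, repeating that dose adds no information about the slope, and the criterion selects a different candidate.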



3. Coherence and Dilemma Between Individual and Collective Ethics 

The preceding section has focused on determining the next dose by minimizing $E_\Pi[\ell(\theta, x)]$,
where $\Pi$ is the current posterior distribution and $\ell$ is a loss function incorporating the trial's
main objective into the Bayes sequential design. In Example 2 we have shown how additional 
information, such as the empirical measure of previous design points, can be included in the 
minimization problem to determine the dose. The following subsections extend this idea to 
address two important issues in Phase I cancer clinical trial designs. 
3.1 Coherence and Its Enforcement
Motivated by ethical concerns, Cheung (2005) introduced coherence principles for sequential
dose escalation or de-escalation. A dose sequence is said to be "coherent" if a higher
(respectively, lower) dose is not given to the next patient when the current patient experiences
(respectively, does not experience) DLT. In particular, CRM and EWOC are coherent and
the following theorem, whose proof is given in the Appendix, provides conditions for the
coherence of a Bayes sequential design that minimizes the posterior loss at every stage.



Theorem 1: Suppose that the dose space is a finite interval and that $\ell(\eta, x)$ is convex
in $x$ for every fixed $\eta$. Assume that for fixed $x > x'$, $\ell(\eta, x) - \ell(\eta, x')$ is non-increasing in $\eta$.
Then the dose sequence $x_n = \arg\min_x E_{\Pi_{n-1}} \ell(\eta, x)$ is coherent.
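As a worked check of the monotonicity assumption, consider the EWOC loss (2) with feasibility bound $\omega$ and fix two doses $x > x'$. Evaluating the difference of losses case by case in the MTD $\eta$ gives

$$\ell(\eta, x) - \ell(\eta, x') = \begin{cases} (1 - \omega)(x - x'), & \eta \le x', \\ (1 - \omega)x + \omega x' - \eta, & x' < \eta < x, \\ -\omega(x - x'), & \eta \ge x. \end{cases}$$

The three pieces agree at $\eta = x'$ and $\eta = x$, the outer pieces are constant, and the middle piece has slope $-1$ in $\eta$, so the difference is non-increasing in $\eta$. The loss (2) is also convex in $x$ for fixed $\eta$, being piecewise linear with slopes $-\omega$ and $1 - \omega$.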

Theorem 1 shows that CRM is coherent since $\ell(\eta, x) = (\eta - x)^2$ is convex and

$$\ell(\eta, x) - \ell(\eta, x') = -2\eta(x - x') + x^2 - (x')^2$$

is non-increasing in $\eta$ for $x > x'$. The loss function (2) associated with EWOC also satisfies the
assumption of Theorem 1, which therefore shows the coherence of EWOC. The loss functions
in Examples 1 and 2, however, may not satisfy the assumptions of Theorem 1. Moreover, a
modification of EWOC recommended by its proponents (Babb and Rogatko, 2004), in which
the feasibility bound is escalated throughout the trial from a low starting value to 1/2 at the
end of the trial, does not satisfy the assumptions of Theorem 1 and it indeed exhibits slight
incoherence in the simulation studies in Section 4. This can be understood by noting that,
toward the end of the trial, the posterior distribution does not change much from patient to 
patient, and that an increase in the feasibility bound may overwhelm the slight downward 
shift in the posterior following an outcome $y = 1$, causing a dose higher than the previous
one to be assigned. Cheung (2005, p. 865) also found a certain two-stage modification of CRM
to be incoherent. On the other hand, we can enforce coherence by modifying $x_n = f(\Pi_{n-1})$
into $x_n = f(\Pi_{n-1}, x_{n-1}, y_{n-1})$, where

$$f(\Pi, x^*, y) = \begin{cases} \arg\min_{x \le x^*} E_\Pi \ell(\eta, x), & \text{if } y = 1, \\ \arg\min_{x \ge x^*} E_\Pi \ell(\eta, x), & \text{if } y = 0. \end{cases} \tag{7}$$
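The restricted minimization in (7) can be sketched as follows, assuming the expected loss is available as a function of the dose and that the search is carried out over a finite list of candidate doses that contains the previous dose. The function name and the quadratic stand-in for the expected loss are hypothetical.

```python
def coherent_dose(expected_loss, candidates, prev_dose, prev_outcome):
    """Minimize the expected loss over doses allowed by the coherence rule (7).

    After a DLT (prev_outcome == 1) only doses <= prev_dose are allowed;
    after no DLT (prev_outcome == 0) only doses >= prev_dose are allowed.
    """
    if prev_outcome == 1:
        allowed = [x for x in candidates if x <= prev_dose]
    else:
        allowed = [x for x in candidates if x >= prev_dose]
    return min(allowed, key=expected_loss)

# Stand-in expected loss with unconstrained minimizer at dose 300.
candidates = [140 + 5 * i for i in range(58)]      # 140, 145, ..., 425
loss = lambda x: (x - 300) ** 2
print(coherent_dose(loss, candidates, 250, 1))     # 250: no escalation after a DLT
print(coherent_dose(loss, candidates, 250, 0))     # 300: escalation allowed
```

When the unconstrained minimizer already respects the coherence rule, the restricted and unrestricted doses coincide, so the modification only acts in the rare incoherent cases.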



3.2 Treatment of Current Patient versus Information for Future Patients 
We have noted in Section 2 that CRM or EWOC treats the next patient at the dose $x$ that
minimizes $E_\Pi[\ell(\eta, x)]$ for $\ell(\eta, x)$ given by $(\eta - x)^2$ or by (2), where $\Pi$ is the current posterior
distribution. This is tantamount to dosing the next patient at the best guess of $\eta$, where
"best" means "closest" according to some measure of distance from $\eta$. On the other hand, a
Bayesian c- or D-optimal design aims at generating doses that provide most information, as
measured by the Fisher information matrix of a design measure, for estimating the
dose-toxicity curve to benefit future patients. To resolve this dilemma between treatment of
patients in the trial and efficient experimental design for post-trial parameter estimation,
Bartroff and Lai (2010) considered the finite-horizon optimization problem of choosing the
dose levels $x_1, x_2, \ldots, x_n$ sequentially to minimize the "global risk"

$$E_{\Pi_0}\left[\sum_{i=1}^{n} h(\eta, x_i) + g(\hat\eta_n, \eta)\right], \tag{8}$$

in which $\Pi_0$ denotes the prior distribution of $\theta$, $h(\eta, x_i)$ represents the loss for the $i$th patient
in the trial, $\hat\eta_n$ is the terminal estimate of the MTD and $g$ represents a terminal loss function.
The optimizing doses $x_i$ depend on $n - i$, where the horizon $n$ is the sample size of the
trial, and therefore are not of the form $x_i = f(\Pi_{i-1})$ considered in Section 2. In terms of
"individual" and "collective" ethics, note that (8) measures the individual effect of the dose
$x_k$ on the $k$th patient through $h(\eta, x_k)$, and its collective effect on future patients through
$\sum_{i>k} h(\eta, x_i) + g(\hat\eta_n, \eta)$.
By using a discounted infinite-horizon version of (8), we can still have solutions of the form
$x_i = f(\Pi_{i-1})$ for some functional $f$ that only depends on $\Pi_{i-1}$. Specifically, take a discount
factor $0 < \delta < 1$ and replace (8) by

$$E_{\Pi_0}\left[\sum_{i=1}^{\infty} h(\eta, x_i)\,\delta^{i-1}\right] \tag{9}$$

as the definition of global risk. Note that this global risk measures the individual effect of the
dose $x_k$ on the $k$th patient through $h(\eta, x_k)$, and its collective effect on future patients through
$\sum_{i>k} h(\eta, x_i)\delta^{i-k}$. This means the myopic dose $x_k$ that minimizes $E_{\Pi_{k-1}}[h(\eta, x)]$ for treating
the $k$th patient has to be perturbed such that it also helps to create a more informative
posterior distribution $\Pi_k$ that is used for dosing future patients. Note that (9) does not have
the term $g(\hat\eta_n, \eta)$ appearing in the finite-horizon problem (8), but even without this term,
the global risk (9) still captures the collective effect of the doses, as indicated above. As we
have pointed out in Section 2, if $x_i$ is of the form $f(\Pi_{i-1})$ for all $i$, then $\{\Pi_k : k \ge 0\}$ is a
Markov chain whose states are distributions of $\theta$ and undergo Markovian dynamics described
by the updating scheme for posterior distributions. In the context of the present problem
of minimizing (9), the optimal expected loss $V(\Pi)$ at state $\Pi$ (posterior distribution of $\theta$)
satisfies Bellman's dynamic programming equation

$$V(\Pi) = \inf_x \{E_\Pi h(\eta, x) + \delta E_\Pi V(\Pi_{+\{x\}})\}, \tag{10}$$

where $\Pi_{+\{x\}}$ is the new posterior distribution of $\theta$ after $(x, y)$ is observed, with
$y \sim \mathrm{Bern}(F_\theta(x))$ and $\theta \sim \Pi$; see the Bayesian updating scheme in Section 2. For finite-state
controlled Markov chains, iteration is a commonly used method to solve (10); see
Bertsekas (2007, Section 1.3).
In the present case, not only is the state space infinite, but it is also infinite-dimensional
(space of all posterior distributions of $\theta$), making dynamic programming intractable.

The main complexity of the infinite-horizon problem is that the dose x for the next patient 
involves also consideration for future patients who will receive optimal doses themselves; 
these future doses depend on the future posterior distributions. A simple way to reduce the 
complexity is to consider two (instead of infinitely many) future patients. This amounts to 
choosing the next dose $x$ to minimize $E_\Pi \ell(\eta, x; \Pi)$ when the current posterior distribution
of $\theta$ is $\Pi$, where

$$\ell(\eta, x; \Pi) = h(\eta, x) + \lambda E_\Pi\{E_{\Pi'}[h(\eta', x') \mid x_1 = x, y_1]\}, \tag{11}$$

in which $\eta' = F_{\theta'}^{-1}(p)$ with $\theta' \sim \Pi'$, and $\Pi'$ and $x'$ are defined below. The first summand in (11)
measures the (toxicity) effect of the dose $x$ on the patient receiving it. The second summand
considers the patient who follows and receives a myopic dose $x'$ which minimizes the patient's
posterior loss; the myopic dose is optimal because there are no more patients involved in
(11). The effect of $x$ on this second patient is through the posterior distribution $\Pi'$ that
updates $\Pi$ after observing $(x_1, y_1)$, with $x_1 = x$. Since $y_1$ is not yet observed, the expectation
outside the curly brackets is taken over $y_1 \sim \mathrm{Bern}(F_\theta(x))$, with $\theta \sim \Pi$. For example, when
implemented with $h(\eta, x)$ given by the EWOC loss function (2), this proposal can be viewed
as a modification of EWOC since it utilizes its loss function but adds an additional term to
represent the effect on future patients.

Unlike $0 < \delta < 1$ in the discounted infinite-horizon problem, the choice of $\lambda > 0$ in (11)
can exceed 1 and reflects the balance between the collective ethics in generating information
for future patients and the individual ethics for the patient receiving the dose. Although
we use here a single patient to represent all patients following the one receiving the next
dose, because the posterior distributions also change successively, the doses are functionals
of these posterior distributions.
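The two-patient look-ahead in (11) can be sketched on a discrete posterior, assuming a hypothetical grid of $(\alpha, \beta)$ values, a finite set of candidate doses and the EWOC loss (2) for $h$. The names `lookahead_risk` and `lookahead_dose` are illustrative, and the grid approximation replaces the exact posterior integrals.

```python
import math

P, OMEGA = 1 / 3, 0.25

def dlt_prob(theta, x):
    alpha, beta = theta
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def mtd(theta):
    alpha, beta = theta
    return (math.log(P / (1.0 - P)) - alpha) / beta

def h(eta, x):
    """EWOC loss (2)."""
    return OMEGA * (eta - x) if x <= eta else (1.0 - OMEGA) * (x - eta)

def update(grid, weights, x, y):
    new = [w * (dlt_prob(th, x) if y == 1 else 1.0 - dlt_prob(th, x))
           for th, w in zip(grid, weights)]
    total = sum(new)
    return [w / total for w in new]

def myopic_risk(grid, weights, x):
    return sum(w * h(mtd(th), x) for th, w in zip(grid, weights))

def myopic_dose(grid, weights, candidates):
    return min(candidates, key=lambda x: myopic_risk(grid, weights, x))

def lookahead_risk(grid, weights, x, lam, candidates):
    """Posterior mean of (11): current loss plus lam times next patient's loss."""
    q = sum(w * dlt_prob(th, x) for th, w in zip(grid, weights))  # P(y1 = 1 | x)
    risk = myopic_risk(grid, weights, x)
    for y, prob in ((1, q), (0, 1.0 - q)):
        post = update(grid, weights, x, y)              # Pi' after (x, y)
        x_next = myopic_dose(grid, post, candidates)    # myopic dose x'
        risk += lam * prob * myopic_risk(grid, post, x_next)
    return risk

def lookahead_dose(grid, weights, lam, candidates):
    return min(candidates,
               key=lambda x: lookahead_risk(grid, weights, x, lam, candidates))

grid = [(a, b) for a in (-6.0, -5.0, -4.0) for b in (0.010, 0.015, 0.020)]
weights = [1.0 / len(grid)] * len(grid)
candidates = [140.0 + 15.0 * i for i in range(20)]
```

With `lam = 0` the rule reduces to the myopic dose; increasing `lam` gives more weight to how informative the current dose is for the patient who follows.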



4. Implementation and a Simulation Study 

In this section we first describe three main components in the implementation of the above 
Bayesian sequential designs and then evaluate their performance in a simulation study. 
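4.1 Updating the Posterior Distribution

Letting $\eta$ denote the MTD and $\rho = F_\theta(x_{\min})$, we follow Babb, Rogatko and Zacks (1998) to
transform $(\alpha, \beta)$ in (1) to $(\rho, \eta)$ via the formulas

$$\alpha = \frac{x_{\min}\log(p^{-1} - 1) - \eta\log(\rho^{-1} - 1)}{\eta - x_{\min}}, \qquad \beta = \frac{\log(\rho^{-1} - 1) - \log(p^{-1} - 1)}{\eta - x_{\min}},$$

and therefore

$$\alpha + \beta x = \frac{(x - \eta)\log(\rho^{-1} - 1) - (x - x_{\min})\log(p^{-1} - 1)}{\eta - x_{\min}} =: G(x, \rho, \eta).$$

We assume that the joint prior distribution of $(\rho, \eta)$ has density $\pi(\rho, \eta)$ with support on
$[0, p] \times [x_{\min}, x_{\max}]$. Therefore the $\mathcal{F}_{k-1}$-posterior density of $(\rho, \eta)$, i.e., the posterior
density given the data from the first $k - 1$ patients, is

$$\pi(\rho, \eta \mid \mathcal{F}_{k-1}) = C \prod_{i=1}^{k-1} \left[\frac{1}{1 + e^{-G(x_i, \rho, \eta)}}\right]^{y_i} \left[\frac{1}{1 + e^{G(x_i, \rho, \eta)}}\right]^{1 - y_i} \pi(\rho, \eta), \tag{12}$$

where

$$C^{-1} = \int_{x_{\min}}^{x_{\max}} \int_0^p \prod_{i=1}^{k-1} \left[\frac{1}{1 + e^{-G(x_i, \rho, \eta)}}\right]^{y_i} \left[\frac{1}{1 + e^{G(x_i, \rho, \eta)}}\right]^{1 - y_i} \pi(\rho, \eta)\, d\rho\, d\eta.$$

The marginal $\mathcal{F}_{k-1}$-posterior density of $\eta$ is then $\int_0^p \pi(\rho, \eta \mid \mathcal{F}_{k-1})\, d\rho$, and the CRM and
EWOC doses based on $\mathcal{F}_{k-1}$ are the mean and $\omega$-quantile of this distribution, respectively.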
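A minimal sketch of this computation, assuming a uniform prior and a midpoint grid in place of the quadrature used in the paper, is shown below. The dose range and target rate are taken from the simulation setting of Section 4.4; the grid sizes and function names are illustrative assumptions.

```python
import math

X_MIN, X_MAX, P = 140.0, 425.0, 1 / 3

def G(x, rho, eta):
    """alpha + beta * x expressed through (rho, eta)."""
    a = math.log(1.0 / rho - 1.0)
    b = math.log(1.0 / P - 1.0)
    return ((x - eta) * a - (x - X_MIN) * b) / (eta - X_MIN)

def dlt_prob(x, rho, eta):
    return 1.0 / (1.0 + math.exp(-G(x, rho, eta)))

def posterior(data, n_rho=40, n_eta=60):
    """Normalized posterior weights (12) on a midpoint grid, uniform prior."""
    rhos = [(i + 0.5) * P / n_rho for i in range(n_rho)]
    etas = [X_MIN + (j + 0.5) * (X_MAX - X_MIN) / n_eta for j in range(n_eta)]
    cells = []
    for rho in rhos:
        for eta in etas:
            like = 1.0
            for x, y in data:
                f = dlt_prob(x, rho, eta)
                like *= f if y == 1 else 1.0 - f
            cells.append((rho, eta, like))
    total = sum(c[2] for c in cells)
    return [(rho, eta, w / total) for rho, eta, w in cells]

def crm_dose(post):
    """Posterior mean of the MTD."""
    return sum(eta * w for _, eta, w in post)

def ewoc_dose(post, omega=0.25):
    """omega-quantile of the marginal posterior of the MTD (on the grid)."""
    marginal = {}
    for _, eta, w in post:
        marginal[eta] = marginal.get(eta, 0.0) + w
    cumulative = 0.0
    for eta in sorted(marginal):
        cumulative += marginal[eta]
        if cumulative >= omega:
            return eta
    return X_MAX

post = posterior([(250.0, 1)])   # hypothetical data: one DLT at dose 250
print(crm_dose(post), ewoc_dose(post))
```

The midpoint grid avoids the boundary values $\rho = 0$ and $\eta = x_{\min}$, at which the transformation is singular.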



4.2 Computation of $E_\Pi \ell(\eta, x)$ and its Minimizer in Sections 2 and 3.1

The integrals in (12) can be evaluated by using a numerical double-integration routine
involving Gaussian quadrature in MATLAB. This can be used to evaluate $E_\Pi \ell(\eta, x)$ for a
posterior distribution $\Pi$. We can find the minimum of $E_\Pi \ell(\eta, x)$ over $x$ by a grid search in
$[x_{\min}, x_{\max}]$, or by using gradient descent if $\ell$ is smooth. For computation of the constrained
Bayesian optimal design (6), a constrained nonlinear optimization routine in MATLAB can be
used in conjunction with numerical integration, as outlined in Haines, Perevozskaya and
Rosenberger (2003, p. 593).
4.3 Minimization of $E_\Pi \ell(\eta, x; \Pi)$ in Section 3.2

While MCMC or rejection sampling can be used to compute (11) for any candidate dose $x$,
importance sampling (e.g., Robert and Casella, 2004, Chapter 3.3) is a simple, robust
alternative that takes advantage of the fact that just an expectation with respect to the posterior
distribution is needed. Letting $\Pi_0$ denote the uniform distribution of the transformed
coordinates $(\rho, \eta)$ over $[0, p] \times [x_{\min}, x_{\max}]$, and writing $\pi_\Pi$ and $\pi_0$ for the densities of $\Pi$
and $\Pi_0$, we have

$$E_\Pi \ell(\eta, x; \Pi) \approx B^{-1} \sum_{b=1}^{B} \ell(\eta_b, x; \Pi)\, \frac{\pi_\Pi(\rho_b, \eta_b)}{\pi_0(\rho_b, \eta_b)} \tag{13}$$

for large $B$, where $(\rho_b, \eta_b)$, $b = 1, \ldots, B$, are i.i.d. and generated from $\Pi_0$. Letting
$\Pi_{+\{x,y\}}$ denote the posterior distribution obtained from $\Pi$ by including $(x, y)$ and letting
$x' = x'(\Pi_{+\{x,y\}})$, the nested expectation in (11) can be similarly approximated by using

$$E_\Pi[h(\eta', x') \mid x_1 = x, y] = E_{\Pi_{+\{x,y\}}} h(\eta', x') \approx B^{-1} \sum_{b=1}^{B} h(\eta'_b, x')\, \frac{\pi_{\Pi_{+\{x,y\}}}(\rho'_b, \eta'_b)}{\pi_0(\rho'_b, \eta'_b)}, \tag{14}$$

$$P_\Pi(y = 1 \mid x) = \int F_\theta(x)\, d\Pi(\theta) \approx B^{-1} \sum_{b=1}^{B} F_{\theta''_b}(x)\, \frac{\pi_\Pi(\rho''_b, \eta''_b)}{\pi_0(\rho''_b, \eta''_b)}, \tag{15}$$

where $(\rho'_b, \eta'_b), (\rho''_b, \eta''_b) \sim \Pi_0$ and $\theta''_b = \theta(\rho''_b, \eta''_b)$. Let $H_\Pi(x, y)$ and $Q_\Pi(x)$ denote the
right-hand sides of (14) and (15), respectively. Combining (13)-(15) gives

$$E_\Pi \ell(\eta, x; \Pi) \approx B^{-1} \sum_{b=1}^{B} \{h(\eta_b, x) + \lambda[H_\Pi(x, 0)(1 - Q_\Pi(x)) + H_\Pi(x, 1) Q_\Pi(x)]\}\, \frac{\pi_\Pi(\rho_b, \eta_b)}{\pi_0(\rho_b, \eta_b)}. \tag{16}$$

We can minimize the right-hand side of (16) over $x \in [x_{\min}, x_{\max}]$ by using a bounded
minimization routine in MATLAB.
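The importance-sampling idea can be sketched as follows, assuming a uniform prior on the transformed coordinates and, as a swapped-in simplification, self-normalized weights so that the normalizing constant in (12) is never computed. The function names, the number of draws and the seed are hypothetical choices.

```python
import math
import random

X_MIN, X_MAX, P = 140.0, 425.0, 1 / 3

def dlt_prob(x, rho, eta):
    """DLT probability at dose x in the (rho, eta) parametrization."""
    a = math.log(1.0 / rho - 1.0)
    b = math.log(1.0 / P - 1.0)
    g = ((x - eta) * a - (x - X_MIN) * b) / (eta - X_MIN)
    return 1.0 / (1.0 + math.exp(-g))

def likelihood(data, rho, eta):
    like = 1.0
    for x, y in data:
        f = dlt_prob(x, rho, eta)
        like *= f if y == 1 else 1.0 - f
    return like

def posterior_expectation(func, data, n_draws=20000, seed=0):
    """Self-normalized importance sampling estimate of E_Pi[func(rho, eta)].

    Draws come from the uniform prior; each draw is weighted by the
    likelihood, which is proportional to the posterior/prior density ratio.
    """
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_draws):
        rho = rng.uniform(0.0, P)
        eta = rng.uniform(X_MIN, X_MAX)
        if rho <= 0.0 or eta <= X_MIN:
            continue  # skip the singular boundary of the transformation
        w = likelihood(data, rho, eta)
        num += w * func(rho, eta)
        den += w
    return num / den

mean_after_dlt = posterior_expectation(lambda rho, eta: eta, [(250.0, 1)])
mean_after_none = posterior_expectation(lambda rho, eta: eta, [(250.0, 0)])
```

Reusing the same seed for both calls uses common random draws, which makes comparisons across candidate doses or outcomes less noisy.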

4.4 Simulation Study 

To compare the proposed procedure in Section 3.2 to EWOC, CRM, and the inverted
overdose control (IVOC) design in Example 1, a simulation study was performed in the
setting of the trial to determine the MTD of the antimetabolite 5-fluorouracil (5-FU) for
treating solid tumors in the colon, as described in Babb, Rogatko and Zacks (1998). Based
on previous studies of 5-FU, a dose of 140 mg/m$^2$ of 5-FU was believed to be safe, and the
MTD was believed to be no greater than 425 mg/m$^2$; thus the dose space was taken to be
the interval $[x_{\min}, x_{\max}] = [140, 425]$. The two-parameter logistic model (1) was chosen based
on previous experience with the agent, and the uniform distribution over $[0, p] \times [x_{\min}, x_{\max}]$
was chosen as the prior distribution $\Pi_0$ for $(\rho, \eta)$, with $p = 1/3$. The feasibility bound of
$\omega = .25$ was chosen, which was also used here for the IVOC weight $\gamma$ in (3). In a trial
of length $n = 24$, Table 1 compares EWOC that uses a linearly escalated feasibility bound
(Babb and Rogatko, 2004), denoted by EWOC*, with IVOC, CRM, and the proposed design
in Section 3.2 with $h$ in (11) given by the EWOC loss function (and denoted by EWOC+, in
which + signifies an additional future patient considered by (11)), for two different values of
the discount factor $\lambda$ in (11). Each entry in the table was calculated from 10,000 simulated
trials. The first set of rows is a Bayesian setting in which, for each replication, a pair $(\rho, \eta)$ is
drawn from $\Pi_0$, and the next three sets of rows are frequentist settings (denoted Freq$_1$, Freq$_2$,
Freq$_3$) where the true values $(\rho, \eta)$ are set at fixed values for all 10,000 replications; these
three pairs of fixed values were drawn from $\Pi_0$. A comprehensive comparison of EWOC,
CRM, sequential c-optimal, constrained D-optimal, ADP and other designs has been given
by Bartroff and Lai (2010), who use approximate dynamic programming (ADP) to minimize
the finite-horizon risk (8).

[Table 1 about here.] 

Table 1 reports two different risk measures. Since the length of the trial is fixed at $n = 24$
in the simulation study, Risk$_1$ is the finite-horizon analog of (9), which is (8) with $h$ given
by the EWOC loss function (2) and no terminal loss (i.e., $g = 0$), and Risk$_2$ is the same risk
function but with $h$ given by the "inverted" loss function (3). Also reported are the bias and
RMSE (root mean squared error $\{E(\hat\eta_n - \eta)^2\}^{1/2}$) of the terminal MTD estimate $\hat\eta_n$ (which
is the mean of the terminal posterior distribution of $\eta$), the DLT rate $P(y = 1)$ (denoted
DLT), the overdose rate $P(x > \eta)$ (denoted OD), the excess DLT rate $E[F_\theta(x) - p]^+$ (denoted
OD*), and the coherence violation rate (denoted ChV)

$$(n - 1)^{-1} \sum_{i=1}^{n-1} \{P(y_i = 0, x_{i+1} < x_i) + P(y_i = 1, x_{i+1} > x_i)\}.$$

In these expressions, $P$ and $E$ denote the probability and expectation, respectively, with
respect to the prior distribution in the Bayesian setting, or with respect to the appropriate
fixed values of $(\rho, \eta)$ in the frequentist settings, and are computed by Monte Carlo.
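For a single simulated trial, the empirical counterpart of the coherence violation rate can be computed as in the sketch below; averaging it over simulated trials gives a Monte Carlo estimate of ChV. The function name and the example trajectory are hypothetical.

```python
def coherence_violation_rate(doses, outcomes):
    """Fraction of the n - 1 transitions that violate coherence.

    A violation is a de-escalation after no DLT (y_i = 0, x_{i+1} < x_i)
    or an escalation after a DLT (y_i = 1, x_{i+1} > x_i).
    """
    n = len(doses)
    violations = 0
    for i in range(n - 1):
        if outcomes[i] == 0 and doses[i + 1] < doses[i]:
            violations += 1
        if outcomes[i] == 1 and doses[i + 1] > doses[i]:
            violations += 1
    return violations / (n - 1)

# Hypothetical five-patient trajectory with two incoherent moves.
doses = [140, 160, 150, 150, 170]
outcomes = [0, 0, 1, 1, 0]
print(coherence_violation_rate(doses, outcomes))  # 0.5
```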

In terms of Risk$_1$ and Risk$_2$, EWOC+ performs better in the Bayesian setting than the
myopic designs EWOC*, IVOC, and CRM, in that order. This occurs in the frequentist
settings as well, although the ordering of the myopic designs varies depending on the
particular parameter values. Even though it myopically minimizes the posterior risk at every
stage, IVOC performs poorly in terms of the cumulative risk, Risk$_2$, in the Bayesian setting.
A possible explanation is that its loss function is a function of $F_\theta(x)$, whose posterior
distribution (induced by the posterior distribution of $\theta$) has relatively large variance toward
the middle of the interval (0, 1) in which $F_\theta(x)$ takes values and, in particular, near $p = 1/3$,
resulting in low initial doses observed in the simulations. On the other hand, in the Freq$_3$
setting, where $\eta$ is relatively small and the dose-response curve is relatively flat (e.g., $\rho$
large), IVOC performs well in terms of the risks. In terms of estimation, EWOC$_{+,\lambda=.4}$ has
the smallest RMSE, with EWOC* and EWOC$_{+,\lambda=.1}$ both comparable in the Bayesian setting.
Moreover, EWOC$_{+,\lambda=.4}$ has uniformly the smallest RMSE in the frequentist settings, with
IVOC comparable to it in Freq$_2$ and IVOC and EWOC$_{+,\lambda=.1}$ comparable to it in Freq$_3$.

5. Conclusion and Discussion 

In this paper we present a general formulation of Bayesian sequential design of Phase I 
cancer trials. This formulation enables us to prove a general coherence result in Theorem 1
applicable to any design that can be defined as the minimizer of the posterior risk when the
loss function satisfies some mild conditions. Although the theorem is proved for the widely
used logistic regression model (1), the last paragraph of its proof in the Appendix shows that
it is applicable to any dose-response model that is non-increasing in the MTD, such as the
model $F_\theta(x) = \{(\tanh x + 1)/2\}^\theta$, which is also popular.

In Section 3.2 we propose a new design that incorporates both the individual ethics of
the current patient being administered the dose, through a given loss function such as the
EWOC loss (2), and the collective ethics of all future patients by including an additional
term in the overall loss function to represent the dose's information content for determining
another dose for the next patient. The simulation study in Section 4.4 shows that this new
design is indeed an improvement over myopic designs in terms of global risk minimization,
post-trial estimation of the MTD, and DLT and OD rates. This design provides a practical
alternative to the optimal design associated with the intractable Markov decision problem of
minimizing (9), which requires at each stage the daunting consideration of all future posterior
distributions and calculating their associated optimal doses. For the finite-horizon problem
of minimizing (8), Bartroff and Lai (2010) have developed an approximate solution which
is a time-varying mixture of myopic and c-optimal designs. The new design in Section 3.2,
which can be described by a time-invariant functional of the posterior distribution at each
stage, is substantially simpler computationally and provides substantial improvement over
the myopic designs. We conjecture that with suitably chosen $\lambda$ (depending on $\delta$), its global
risk (9) can approximate that of the optimal design minimizing (9). Instead of minimizing
(9) directly, it may be possible to obtain a good lower bound for (9). Such a bound, which
can provide a benchmark for assessing the proposed design, is a topic for future work.

We also consider an "inverted" loss function (3), which measures deviation from the target
DLT rate $p$ on the probability scale rather than on the dose scale, and the associated myopic
design IVOC. Even though IVOC minimizes the myopic posterior expected loss (3) at each
stage, its cumulative global loss Risk$_2$ in Table 1 is far from optimum, exceeding even that of
EWOC which uses a completely different loss function, in the Bayesian setting. On the other
hand, the design proposed in Section 3.2 can be applied with the IVOC loss function (3) to
yield a substantially improved design IVOC+.



Appendix 

Proof of Theorem 1. We prove coherence in de-escalation; the proof for escalation is similar.
Let $x_{\min} < x_{\max}$ be the boundaries of the dose space, which is assumed to be a finite interval.
For fixed $\eta$, since $\ell(\eta, x)$ is a convex function of $x$, its right derivative $\ell_x(\eta, x)$ with respect
to $x$ is nondecreasing for $x_{\min} \le x < x_{\max}$, and the same is also true for the left derivative
for $x_{\min} < x \le x_{\max}$. Moreover, the left and right derivatives are equal and continuous
except for at most countably many points; see Rockafellar (1970, pages 214, 228, 244). Let
$x_\Pi = \arg\min_x E_\Pi \ell(\eta, x)$, $\tilde\Pi$ be the posterior distribution obtained from $\Pi$ and the additional
dose-response pair $(x, y) = (x_\Pi, 1)$, and let $L(x) = E_{\tilde\Pi} \ell(\eta, x)$. Since $\ell(\eta, x)$ is convex in
$x$ for every $\eta$, so is $L(x)$; moreover, its right derivative is given by $L^+(x) = E_{\tilde\Pi} \ell_x(\eta, x)$.
To show that $x_{\tilde\Pi} \le x_\Pi$, we shall assume that $x_\Pi < x_{\max}$ because the case $x_\Pi = x_{\max}$ is
trivial. It suffices to show that $L^+(x_\Pi) \ge 0$ because $L$ is convex and has minimizer $x_{\tilde\Pi}$. Since
$E_\Pi \ell_x(\eta, x_\Pi) \ge 0$ and $d\tilde\Pi(\theta) = F_\theta(x_\Pi)\, d\Pi(\theta) / \int F_{\theta'}(x_\Pi)\, d\Pi(\theta')$, recalling that $(x, y) = (x_\Pi, 1)$,
it follows that

$$
\begin{aligned}
L^+(x_\Pi) &\ge E_{\tilde\Pi} \ell_x(\eta, x_\Pi) - E_\Pi \ell_x(\eta, x_\Pi) \\
&= \frac{\int \ell_x(\eta, x_\Pi) F_\theta(x_\Pi)\, d\Pi(\theta)}{\int F_{\theta'}(x_\Pi)\, d\Pi(\theta')} - \frac{\int \ell_x(\eta, x_\Pi)\, d\Pi(\theta)}{\int d\Pi(\theta')} \\
&= \frac{A}{\int F_{\theta'}(x_\Pi)\, d\Pi(\theta')},
\end{aligned}
\tag{17}
$$

where $A = \int\!\!\int \ell_x(\eta, x_\Pi)[F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta')$. A change of variables also yields
$A = -\int\!\!\int \ell_x(\eta', x_\Pi)[F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta')$. Hence

$$
2A = \int\!\!\int [\ell_x(\eta, x_\Pi) - \ell_x(\eta', x_\Pi)][F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta') \ge 0, \tag{18}
$$

in which the inequality follows from

$$
[\ell_x(\eta, x_\Pi) - \ell_x(\eta', x_\Pi)][F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)] \ge 0 \tag{19}
$$

for all $x$, $\theta$ and $\theta'$, as will be shown below. Combining (17) and (18) yields $L^+(x_\Pi) \ge 0$,
completing the proof of the theorem.

From the assumption that $\ell(\eta, x) - \ell(\eta, x')$ is non-increasing in $\eta$ for any $x > x'$, it follows
that $\ell_x(\eta, x)$ is non-increasing in $\eta$ for fixed $x$. It therefore suffices for the proof of (19)
to show that $F_\theta(x)$ is non-increasing in $\eta$. Since $p^{-1} = 1/F_\theta(\eta) = 1 + e^{-(\alpha + \beta\eta)}$, we have
$F_\theta(x) = 1/[1 + \exp\{\log(p^{-1} - 1) + \beta\eta - \beta x\}]$, which is non-increasing in $\eta$ since $\beta > 0$.
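
The coherence property established above can be checked numerically. The following Python sketch does so under simplifying assumptions that are not taken from the paper: the slope $\beta$ is treated as known, the prior on the MTD $\eta$ is uniform on a finite grid that doubles as the dose set, and the loss is an EWOC-type asymmetric absolute error. The constants and function names are hypothetical. The loop counts escalations after a DLT and de-escalations after a non-DLT; under these assumptions the theorem implies the count is zero.

```python
import math
import random

P_TARGET = 1 / 3    # assumed target DLT probability p
BETA = 0.02         # assumed known slope, beta > 0
ALPHA = 0.25        # assumed EWOC feasibility bound
GRID = [100 + 5 * i for i in range(81)]   # doses and MTD support: 100, 105, ..., 500


def dlt_prob(x, eta):
    """Logistic model written in terms of the MTD eta, so that F(eta) = p."""
    return 1.0 / (1.0 + math.exp(math.log(1 / P_TARGET - 1) + BETA * (eta - x)))


def ewoc_loss(eta, x):
    """Convex in x; overdosing (x > eta) costs more than underdosing when ALPHA < 1/2."""
    return ALPHA * (eta - x) if x <= eta else (1 - ALPHA) * (x - eta)


def myopic_dose(weights):
    """Smallest grid dose minimizing the posterior expected loss."""
    def risk(x):
        return sum(w * ewoc_loss(eta, x) for eta, w in zip(GRID, weights))
    return min(GRID, key=risk)


def update(weights, x, y):
    """Posterior weights on the MTD grid after observing outcome y at dose x."""
    like = [dlt_prob(x, eta) if y == 1 else 1 - dlt_prob(x, eta) for eta in GRID]
    post = [w * l for w, l in zip(weights, like)]
    total = sum(post)
    return [v / total for v in post]


def count_coherence_violations(n_patients=30, true_eta=250, seed=0):
    """Simulate one trial and count incoherent dose moves."""
    rng = random.Random(seed)
    weights = [1 / len(GRID)] * len(GRID)   # uniform prior on the MTD
    x = myopic_dose(weights)
    violations = 0
    for _ in range(n_patients):
        y = 1 if rng.random() < dlt_prob(x, true_eta) else 0
        weights = update(weights, x, y)
        x_next = myopic_dose(weights)
        # Coherence: never escalate after a DLT, never de-escalate after a non-DLT.
        if (y == 1 and x_next > x) or (y == 0 and x_next < x):
            violations += 1
        x = x_next
    return violations


print(count_coherence_violations())
```

Placing the MTD support on the dose grid keeps the grid minimizer equal to a minimizer over the whole dose interval, which is the setting of the proof.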
Table 1

Risk1, Risk2, bias and RMSE of the final MTD estimate, DLT rate, MTD overdose rate (OD), excess DLT rate
$E[F_\theta(x) - p]^+$ (OD*), and coherence violation rate (ChV), with SEs in parentheses, of various designs.

Statistic  EWOC*               IVOC                CRM                 EWOC+ (λ = .1)      EWOC.

Bayesian: $(\rho, \eta) \sim \Pi$

Risk1      485.5 (3.6)         723.2 (3.6)         986.1 (45.9)        469.5 (3.0)         454.8 (2.8)
Risk2      1.03 (.01)          1.44 (.02)          1.54 (.02)          .85 (.01)           .73 (.007)
Bias       -9.67 (.6)          -50.2 (.8)          22.3 (1.6)          2.42 (.7)           -5.9 (.6)
RMSE       61.5 (.7)           75.8 (.2)           157.3 (.8)          66.9 (.2)           58.6 (.7)
DLT (%)    33.5 (.001)         26.6 (9×10^-4)      39.1 (.001)         29.3 (.0009)        29.1 (9×10^-4)
OD (%)     37.4 (.001)         17.5 (8×10^-4)      55.6 (.001)         29.6 (9×10^-4)      27.0 (9×10^-4)
OD*        .043 (2×10^-4)      .043 (3×10^-4)      .078 (3×10^-4)      .029 (2×10^-4)      .021 (1×10^-4)
ChV (%)    0 (0)               .6 (.04)            0 (0)               15.0 (8×10^-4)      14.4 (8×10^-4)

Freq1: $\rho = .07$, $\eta = 403.9$

Risk1      585.6 (1.3)         1302.2 (.8)         368.8 (1.1)         175.9 (1.0)         170.3 (1.1)
Risk2      .78 (.002)          1.40 (6×10^-4)      .53 (.001)          .29 (.001)          .25 (.001)
Bias       -51.1 (.2)          -151.7 (.2)         -30.1 (.4)          -23.2 (.3)          -48.7 (.2)
RMSE       58.8 (.8)           156.1 (.6)          44.9 (.1)           49.0 (.5)           19.2 (.4)
DLT (%)    20.6 (8×10^-4)      10.3 (7×10^-4)      24.9 (9×10^-4)      18.7 (8×10^-4)      18.6 (8×10^-4)
OD (%)     0 (0)               0 (0)               2.3 (3×10^-4)       1.2 (2×10^-4)       1.1 (2×10^-4)
OD*        0 (0)               0 (0)               4×10^-4 (1×10^-5)   .001 (1×10^-5)      .001 (1×10^-5)
ChV (%)    0 (0)               0 (0)               0 (0)               0 (0)               0 (0)

Freq2: $\rho = .19$, $\eta = 269.1$

Risk1      402.8 (.8)          423.1 (.9)          313.7               275.0 (3.7)         266.2 (3.8)
Risk2      .53 (.003)          .49 (.001)          .99 (.006)          .43 (.001)          .26 (.001)
Bias       15.0 (.5)           -28.9 (.1)          32.5 (.4)           17.3 (.7)           12.3 (.4)
RMSE       47.5 (.2)           32.0 (.7)           52.1 (.7)           38.2 (.3)           25.2 (.3)
DLT (%)    32.6 (.001)         25.4 (9×10^-4)      38.2 (.001)         28.2 (9×10^-4)      24.8 (9×10^-4)
OD (%)     32.8 (.001)         4.3 (4×10^-4)       83.1 (8×10^-4)      27.1 (9×10^-4)      26.9 (9×10^-4)
OD*        .02 (7×10^-5)       0 (0)               .053 (8×10^-5)      .03 (4×10^-5)       .01 (4×10^-5)
ChV (%)    0 (0)               3.1 (4×10^-4)       0 (0)               17.6 (8×10^-4)      12.7 (7×10^-4)

Freq3: $\rho = .30$, $\eta = 226.7$

Risk1      674.3 (4.9)         232.1 (2.9)         1444.9 (6.3)        158.9 (4.9)         146.1 (3.5)
Risk2      .27 (.002)          .09 (4×10^-4)       .59 (.003)          .05 (4×10^-4)       .05 (6×10^-4)
Bias       42.9 (.62)          10.3 (.14)          81.4 (.5)           54.9 (.54)          47.4 (.49)
RMSE       74.2 (.7)           17.4 (.5)           96.8 (.8)           18.0 (.5)           13.2 (.2)
DLT (%)    34.2 (.001)         32.7 (.001)         36.6 (.001)         35.8 (.001)         35.7 (.001)
OD (%)     73.2 (.001)         54.3 (.001)         93.5 (5×10^-4)      88.8 (6×10^-4)      81.0 (6×10^-4)
OD*        .014 (3×10^-5)      .002 (2×10^-5)      .032 (4×10^-5)      .031 (4×10^-5)      .030 (4×10^-5)
ChV (%)    0 (0)               3.9 (5×10^-4)       0 (0)               10.0 (6×10^-4)      9.6 (6×10^-4)



